734 research outputs found

    Statistical analysis of grouped text documents

    The topic of this thesis is statistical models for the analysis of textual data, with emphasis on contexts in which text samples are grouped. When dealing with text data, the first issue is to process it so that it is computationally and methodologically compatible with the mathematical and statistical methods produced and continually developed by the scientific community. The thesis therefore first reviews existing methods for analytically representing and processing textual datasets, including Vector Space Models, distributed representations of words and documents, and contextualized embeddings. In doing so, it standardizes a notation that, even within the same representation approach, appears highly heterogeneous in the literature. Two application domains are then explored: social media and cultural tourism. For the former, the thesis proposes a study of self-presentation among diverse groups of individuals on the StockTwits platform, where finance and stock markets are the dominant topics. The proposed methodology integrated several types of data, both textual and categorical. This study revealed insights into how people present themselves online and found recurring patterns within groups of users. For the latter, the thesis delves into a study conducted as part of the "Data Science for Brescia - Arts and Cultural Places" project, in which a language model was trained to classify online reviews written in Italian into four distinct semantic areas related to cultural attractions in the Italian city of Brescia. The proposed model identifies attractions in text documents even when they are not explicitly mentioned in the document metadata, which opens the possibility of expanding the database of these cultural attractions with new sources, such as social media platforms, forums, and other online spaces. Lastly, the thesis presents a methodological study of the group-specificity of words, analyzing various group-specificity estimators proposed in the literature. The study considered grouped text documents with both an outcome variable and a group variable. Its contribution is the proposal to model the corpus of documents as a multivariate distribution, which enables the simulation of corpora of text documents with predefined characteristics. The simulation provided valuable insights into the relationship between groups of documents and words. All of its results can be freely explored through a web application, whose components are also described in this manuscript. In conclusion, this thesis has been conceived as a collection of papers. It aimed to contribute to the field with both applications and methodological proposals, and each study presented here suggests paths for future research to address the challenges in the analysis of grouped textual data.

    Evaluation of term-weighting measures for grouped text documents with a target variable: a simulation study

    [EN] In Text Mining applications, count-based models are often used to represent text documents. When two document variables are available, i.e. an outcome and a grouping variable, the weight of a word for the documents may depend on group membership. The contribution of this work is to frame this context with a statistical approach, by modelling the corpus of documents with a Multivariate Binomial distribution (Hudson et al., 1986). The advantage of this solution is two-fold: it makes it possible (1) to review, in a statistical framework, some term-weighting measures used in the literature (Samant et al., 2019), and (2) to simulate corpora with predefined characteristics by means of the Gaussian Copula method (Genest and McKay, 1986). This simulation is useful for investigating the ability of the existing measures, computed on the group-word interaction, to capture both the group-word relationship itself and the target-word association. Results from the simulation study show interesting relationships that can be explored with visualization tools.
    Ricciardi, R.; Manisera, M. (2023). Evaluation of term-weighting measures for grouped text documents with a target variable: a simulation study. Editorial Universitat Politècnica de València. 97-98. http://hdl.handle.net/10251/201768979
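The simulation idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each word count in a document follows a Binomial(doc_len, p) marginal and joins the marginals with a Gaussian copula, which is one simple way to obtain correlated word counts. The function names (`cholesky`, `binom_ppf`, `simulate_counts`), the parameter values, and the correlation matrix are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def binom_ppf(u, n, p):
    """Smallest k such that P(X <= k) >= u for X ~ Binomial(n, p)."""
    cdf = 0.0
    for k in range(n + 1):
        cdf += math.comb(n, k) * p**k * (1 - p) ** (n - k)
        if cdf >= u:
            return k
    return n  # guards against rounding when u is very close to 1


def simulate_counts(n_docs, doc_len, probs, corr, seed=0):
    """Simulate a document-term count matrix with correlated word counts.

    probs: assumed per-word success probabilities (one binomial marginal per word).
    corr:  assumed correlation matrix of the latent Gaussian variables.
    """
    rng = random.Random(seed)
    low = cholesky(corr)
    std_normal = NormalDist()
    n_words = len(probs)
    rows = []
    for _ in range(n_docs):
        # Independent standard normals, then correlate them with the Cholesky factor.
        z = [rng.gauss(0.0, 1.0) for _ in range(n_words)]
        y = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n_words)]
        # Map to uniforms (the copula step), then to binomial counts.
        u = [std_normal.cdf(v) for v in y]
        rows.append([binom_ppf(ui, doc_len, p) for ui, p in zip(u, probs)])
    return rows


if __name__ == "__main__":
    corr = [[1.0, 0.8], [0.8, 1.0]]
    for row in simulate_counts(5, 20, [0.1, 0.3], corr, seed=1):
        print(row)
```

Under these assumptions, group-specific corpora could be obtained by calling `simulate_counts` once per group with different `probs`, and term-weighting measures could then be compared on the simulated counts.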

    Bilateral dimorphism of Loewenthal's gland in young male albino rats: an ultrastructural investigation

    This study is a further contribution to our knowledge of the structure of Loewenthal's gland. The available literature on the topic contains several divergences, concerning both histological and ultrastructural findings. However, the authors of these studies did not take into account the potential influence of a putative side-dependent dimorphism that we reported previously. We therefore carried out histological and electron-microscopic observations specifically aimed at evaluating the importance of the gland shape for its structure. In particular, we compared the structure of the left and right glands in male albino rats aged 70-120 days. Depending on the side under morphological investigation, we observed differences in the acini, cells, nuclei, endoplasmic reticulum, Golgi apparatus and granular content. Apart from slight individual differences, we found that structural variations were most frequently observed in glands displaying a more evident macroscopic side-specific dimorphism. Our findings demonstrate that several conflicting data in the literature on the structure of Loewenthal's glands might be explained by the morphofunctional side-dependent dimorphism of the organ.

    Cervical mucus proteome in endometriosis

    Additional file 1: Table S1. Identified proteins in CM in the group of controls and in patients affected by endometriosis

    Spontaneous vertebral aspergillosis, the state of art: a systematic literature review

    Objective: Vertebral aspergillosis is a rare, often misdiagnosed condition that requires long-term antibiotic therapy and, sometimes, surgical treatment. The present investigation aimed to examine the epidemiology, clinical-radiological aspects, treatment protocols, and outcomes of Aspergillus-mediated vertebral osteomyelitis. Methods: A systematic review of the pertinent English literature was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The search was conducted on the Cochrane Library, MEDLINE, PubMed and Scopus using the search terms “Aspergillus”, “vertebral osteomyelitis”, “spondylodiscitis”, and “spine infection”. A case of conservatively managed vertebral aspergillosis was also reported. Results: Eighty-nine articles were included in our systematic review. Including the reported case, our analysis covered 112 cases of vertebral aspergillosis. Aspergillus fumigatus was isolated in 68 cases (61.2%), Aspergillus flavus in 14 (12.6%), Aspergillus terreus in 4 (3.6%), and Aspergillus nidulans in 2 (1.8%). Seventy-three patients (65.7%) had completely recovered at the last follow-up evaluation; radiological signs of chronic infection were reported in 7 (6.3%) patients, whereas 32 (28.8%) patients died during follow-up. Conclusion: This systematic review summarizes the state of the art on vertebral aspergillosis, retrieving data on clinical features, diagnostic criteria and current limitations, and treatment alternatives and their outcomes.

    How skill expertise shapes the brain functional architecture: an fMRI study of visuo-spatial and motor processing in professional racing-car and naïve drivers

    The present study was designed to investigate the brain functional architecture that subserves visuo-spatial and motor processing in highly skilled individuals. Using functional magnetic resonance imaging (fMRI), we measured brain activity while eleven Formula racing-car drivers and eleven ‘naïve’ volunteers performed a motor reaction task and a visuo-spatial task. Tasks were set at a relatively low level of difficulty so as to ensure similar performance in the two groups and thus avoid any potential confounding effects on brain activity due to discrepancies in task execution. The brain functional organization was analyzed in terms of regional brain response, inter-regional interactions and blood oxygen level dependent (BOLD) signal variability. While performance levels were equal in the two groups, professional drivers, as compared to naïve drivers, showed a smaller volume recruitment of task-related regions, stronger connections among task-related areas, and increased information integration as reflected by a higher signal temporal variability. In conclusion, our results demonstrate that, as compared to naïve subjects, the brain functional architecture sustaining visuo-motor processing in professional racing-car drivers, trained to perform at the highest levels under extremely demanding conditions, undergoes both ‘quantitative’ and ‘qualitative’ modifications that are evident even when the brain is engaged in relatively simple, non-demanding tasks. These results provide novel evidence in favor of an increased ‘neural efficiency’ in the brain of highly skilled individuals.

    It’s not all in your car: functional and structural correlates of exceptional driving skills in professional racers

    Driving is a complex behavior that requires the integration of multiple cognitive functions. While many studies have investigated brain activity related to driving simulation under distinct conditions, little is known about the brain morphological and functional architecture in professional competitive driving, which requires exceptional motor and navigational skills. Here, 11 professional racing-car drivers and 11 “naïve” volunteers underwent both structural and functional brain magnetic resonance imaging (MRI) scans. Subjects were presented with short movies depicting a Formula One car racing on four different official circuits. Brain activity was assessed in terms of regional response, using an Inter-Subject Correlation (ISC) approach, and in terms of regional interactions, by means of functional connectivity. In addition, voxel-based morphometry (VBM) was used to identify specific structural differences between the two groups and potential interactions with functional differences detected by the ISC analysis. Relative to non-experienced drivers, professional drivers showed a more consistent recruitment of areas devoted to motor control and spatial navigation, including premotor/motor cortex, striatum, anterior and posterior cingulate cortex, retrosplenial cortex, precuneus, middle temporal cortex, and parahippocampus. Moreover, some of these brain regions, including the retrosplenial cortex, also had an increased gray matter density in professional car drivers. Furthermore, the retrosplenial cortex, which has previously been associated with the storage of observer-independent spatial maps, revealed a specific correlation with the individual driver's success in official competitions. These findings indicate that the brain functional and structural organization in highly trained racing-car drivers differs from that of subjects with ordinary driving experience, suggesting that specific anatomo-functional changes may underlie the attainment of exceptional driving performance.

    Aniseed, Pimpinella anisum, as a source of new agrochemicals: phytochemistry and insights on insecticide and acaricide development

    Pimpinella anisum L. (Apiaceae), known around the world as aniseed, is a widely cultivated crop native to the sub-Mediterranean area. Its essential oil (EO) is exploitable in different fields such as food and beverages, pharmaceutics, cosmetics, and nutraceuticals. Regardless of geographic origin, the EO consistently exhibited a predominance of trans-anethole. Among the numerous biological properties exerted by aniseed EO, its antimicrobial, antifungal, insecticidal, and acaricidal effects have been extensively investigated for the formulation of biopesticides against larvae and adults of various pests and vectors. Here, the published data on the insecticidal and acaricidal activity of aniseed EO and its major compounds on agricultural pests, stored-product pests, and arthropods of medical and veterinary interest are reviewed. For each study, the arthropod and the developmental stage on which the aniseed EO or the aniseed EO-based formulation was tested, the mode of action, the main constituents, and the exerted mortality, as well as the toxicity to non-target organisms and the possible sub-lethal effects, are reported. The advantages of the possible use of aniseed EO as a biopesticide are analysed, as well as the current weaknesses and the critical points to be overcome to open the doors to the industrial utilization of Apiaceae EOs by the agrochemical industry.

    Identification of Novel Putative Urinary Markers of Endometriosis by High-Resolution Quantitative Proteomics

    Endometriosis is a chronic gynecological inflammatory disease characterized by the presence of functional endometrial glands..